Robot Motion Planning
Robot Motion Planning using One-Step Diffusion with Noise-Optimized Approximate Motions
Aizu, Tomoharu, Oba, Takeru, Kondo, Yuki, Ukita, Norimichi
This paper proposes an image-based robot motion planning method using a one-step diffusion model. While the diffusion model allows for high-quality motion generation, its computational cost is too expensive to control a robot in real time. To achieve high quality and efficiency simultaneously, our one-step diffusion model takes an approximately generated motion, which is predicted directly from input images. This approximate motion is optimized by additive noise provided by our novel noise optimizer. Unlike general isotropic noise, our noise optimizer adjusts noise anisotropically depending on the uncertainty of each motion element. Our experimental results demonstrate that our method outperforms state-of-the-art methods while maintaining its efficiency through one-step diffusion.
INTRODUCTION: For robot motion planning, we have to compare the current state of a robot with its surrounding environment. Among the various sensors for observing the environment, including the robot itself, optical sensors such as RGB and RGB-Depth cameras are widely used because of their wide availability, wide observation ranges, and so on. We call robot motion planning using camera images image-based robot motion planning [4], [10], [17].
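A minimal numpy sketch of the anisotropic noise injection described above. The trajectory shape and the per-element uncertainty values are hypothetical stand-ins, and the one-step denoiser itself is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

# Approximate motion predicted directly from input images
# (here a random stand-in for a T-step, D-dim joint trajectory).
T, D = 16, 7
approx_motion = rng.standard_normal((T, D))

# Per-element uncertainty estimates (hypothetical values): larger
# uncertainty leads to larger noise injected before the one-step denoiser.
uncertainty = rng.uniform(0.05, 0.5, size=(T, D))

# Anisotropic noise: scale isotropic Gaussian noise element-wise,
# rather than applying a single global noise level.
noise = rng.standard_normal((T, D)) * uncertainty
noisy_motion = approx_motion + noise

# A one-step diffusion model would map noisy_motion to a refined motion
# in a single network call; here we only show the noise injection.
print(noisy_motion.shape)
```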
From Configuration-Space Clearance to Feature-Space Margin: Sample Complexity in Learning-Based Collision Detection
Tubul, Sapir, Tamar, Aviv, Solovey, Kiril, Salzman, Oren
Motion planning is a central challenge in robotics, with learning-based approaches gaining significant attention in recent years. Our work focuses on a specific aspect of these approaches: using machine-learning techniques, particularly Support Vector Machines (SVM), to evaluate whether robot configurations are collision free, an operation termed ``collision detection''. Despite the growing popularity of these methods, there is a lack of theory supporting their efficiency and prediction accuracy. This is in stark contrast to the rich theoretical results of machine-learning methods in general and of SVMs in particular. Our work bridges this gap by analyzing the sample complexity of an SVM classifier for learning-based collision detection in motion planning. We bound the number of samples needed to achieve a specified accuracy at a given confidence level. This result is stated in terms relevant to robot motion-planning such as the system's clearance. Building on these theoretical results, we propose a collision-detection algorithm that can also provide statistical guarantees on the algorithm's error in classifying robot configurations as collision-free or not.
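A toy example of the learning-based collision detection the abstract analyzes, assuming scikit-learn is available. The disc obstacle, sample count, and kernel parameters are illustrative choices, not taken from the paper:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy 2D configuration space: a point robot and one disc obstacle at the
# origin with radius 0.5 (hypothetical geometry for illustration).
def in_collision(q):
    return np.linalg.norm(q, axis=-1) < 0.5

# Sample configurations and label them with the exact (expensive) checker.
X = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = in_collision(X)

# Learn a cheap surrogate collision detector, as in SVM-based approaches;
# sample complexity bounds ask how many labeled samples this step needs.
clf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

# The learned classifier now answers collision queries without geometry.
print(clf.predict([[0.0, 0.0], [0.9, 0.9]]))
```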
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise
Fu, Chaoyou, Zhang, Renrui, Wang, Zihan, Huang, Yubo, Zhang, Zhengye, Qiu, Longtian, Ye, Gaoxiang, Shen, Yunhang, Zhang, Mengdan, Chen, Peixian, Zhao, Sirui, Lin, Shaohui, Jiang, Deqiang, Yin, Di, Gao, Peng, Li, Ke, Li, Hongsheng, Sun, Xing
The surge of interest towards Multi-modal Large Language Models (MLLMs), e.g., GPT-4V(ision) from OpenAI, has marked a significant trend in both academia and industry. They endow Large Language Models (LLMs) with powerful capabilities in visual understanding, enabling them to tackle diverse multi-modal tasks. Very recently, Google released Gemini, its newest and most capable MLLM built from the ground up for multi-modality. In light of the superior reasoning capabilities, can Gemini challenge GPT-4V's leading position in multi-modal learning? In this paper, we present a preliminary exploration of Gemini Pro's visual understanding proficiency, which comprehensively covers four domains: fundamental perception, advanced cognition, challenging vision tasks, and various expert capacities. We compare Gemini Pro with the state-of-the-art GPT-4V to evaluate its upper limits, along with the latest open-sourced MLLM, Sphinx, which reveals the gap between manual efforts and black-box systems. The qualitative samples indicate that, while GPT-4V and Gemini showcase different answering styles and preferences, they can exhibit comparable visual reasoning capabilities, and Sphinx still trails behind them concerning domain generalizability. Specifically, GPT-4V tends to elaborate detailed explanations and intermediate steps, and Gemini prefers to output a direct and concise answer. The quantitative evaluation on the popular MME benchmark also demonstrates the potential of Gemini to be a strong challenger to GPT-4V. Our early investigation of Gemini also observes some common issues of MLLMs, indicating that there still remains a considerable distance towards artificial general intelligence. Our project for tracking the progress of MLLM is released at https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Using Implicit Behavior Cloning and Dynamic Movement Primitive to Facilitate Reinforcement Learning for Robot Motion Planning
Zhang, Zengjie, Hong, Jayden, Enayati, Amir Soufi, Najjaran, Homayoun
Reinforcement learning (RL) for motion planning of multi-degree-of-freedom robots still suffers from low efficiency in terms of slow training speed and poor generalizability. In this paper, we propose a novel RL-based robot motion planning framework that uses implicit behavior cloning (IBC) and dynamic movement primitives (DMP) to improve the training speed and generalizability of an off-policy RL agent. IBC utilizes human demonstration data to accelerate the training of RL, and DMP serves as a heuristic model that transfers motion planning into a simpler planning space. To support this, we also create a human demonstration dataset using a pick-and-place experiment that can be used for similar studies. Comparison studies in simulation reveal the advantage of the proposed method over conventional RL agents, with faster training and higher scores. A real-robot experiment indicates the applicability of the proposed method to a simple assembly task. Our work provides a novel perspective on using motion primitives and human demonstration to improve the performance of RL for robot applications.
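A DMP like the one used above can be sketched minimally as a critically damped spring-damper system that converges to a goal; a full DMP adds a learned forcing term shaped by a canonical system, which is omitted here. The gains `alpha` and `beta` are conventional illustrative values, not from the paper:

```python
import numpy as np

def dmp_rollout(y0, goal, n_steps=200, dt=0.01, alpha=25.0, beta=6.25):
    """Minimal discrete dynamic movement primitive (no learned forcing
    term): a critically damped spring-damper that converges to `goal`."""
    y, v = float(y0), 0.0
    traj = []
    for _ in range(n_steps):
        a = alpha * (beta * (goal - y) - v)  # spring-damper acceleration
        v += a * dt
        y += v * dt
        traj.append(y)
    return np.array(traj)

# Roll out one degree of freedom from 0.0 to 1.0.
traj = dmp_rollout(y0=0.0, goal=1.0)
print(traj[-1])
```

With `beta = alpha / 4` the system is critically damped, so the trajectory approaches the goal smoothly without overshoot; a learned forcing term would shape this baseline path into a demonstrated motion.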
Sim2Plan: Robot Motion Planning via Message Passing between Simulation and Reality
Zhao, Yizhou, Zeng, Yuanhong, Long, Qian, Wu, Ying Nian, Zhu, Song-Chun
Simulation-to-real is the task of training and developing machine learning models and deploying them in real settings with minimal additional training. This approach is becoming increasingly popular in fields such as robotics. However, there is often a gap between the simulated environment and the real world, and machine learning models trained in simulation may not perform as well in the real world. We propose a framework that utilizes a message-passing pipeline to minimize the information gap between simulation and reality. The message-passing pipeline is comprised of three modules: scene understanding, robot planning, and performance validation. First, the scene understanding module aims to match the scene layout between the real environment set-up and its digital twin. Then, the robot planning module solves a robotic task through trial and error in the simulation. Finally, the performance validation module verifies the planning results by continually checking the difference in robot and object status between the real set-up and the simulation. In the experiment, we perform a case study that requires a robot to make a cup of coffee. Results show that the robot is able to complete the task successfully under our framework. The robot follows the steps programmed into its system and utilizes its actuators to interact with the coffee machine and other tools required for the task. The results of this case study demonstrate the potential benefits of our method for driving robots in tasks that require precision and efficiency. Further research in this area could lead to the development of even more versatile and adaptable robots, opening up new possibilities for automation in various industries.
Robot Motion Planning as Video Prediction: A Spatio-Temporal Neural Network-based Motion Planner
Zang, Xiao, Yin, Miao, Huang, Lingyi, Yu, Jingjin, Zonouz, Saman, Yuan, Bo
Neural network (NN)-based methods have emerged as an attractive approach for robot motion planning due to the strong learning capabilities of NN models and their inherently high parallelism. Despite the current development in this direction, the efficient capture and processing of important sequential and spatial information, in a direct and simultaneous way, is still relatively under-explored. To overcome this challenge and unlock the potential of neural networks for motion planning tasks, in this paper, we propose STP-Net, an end-to-end learning framework that can fully extract and leverage important spatio-temporal information to form an efficient neural motion planner. By interpreting the movement of the robot as a video clip, robot motion planning is transformed into a video prediction task that can be performed by STP-Net in both spatially and temporally efficient ways. Empirical evaluations across different seen and unseen environments show that, with nearly 100% accuracy (i.e., success rate), STP-Net demonstrates very promising performance with respect to both planning speed and path cost. Compared with existing NN-based motion planners, STP-Net achieves at least 5x, 2.6x and 1.8x faster speed with lower path cost on 2D Random Forest, 2D Maze and 3D Random Forest environments, respectively. Furthermore, STP-Net can quickly and simultaneously compute multiple near-optimal paths in multi-robot motion planning tasks.
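The video-clip framing can be made concrete with a toy encoding: each planning state becomes an image with obstacle, robot, and goal channels, and a predictor emits the next frame. The greedy `predict_next` below is a hand-written stand-in for the learned model, and the grid and obstacle layout are hypothetical:

```python
import numpy as np

# Encode a 2D planning state as image channels, as in the video framing:
# channel 0 = obstacle map, channel 1 = robot cell, channel 2 = goal cell.
H = W = 8
obstacles = np.zeros((H, W), dtype=np.float32)
obstacles[3, 2:6] = 1.0  # a wall segment

def make_frame(robot, goal):
    frame = np.zeros((3, H, W), dtype=np.float32)
    frame[0] = obstacles
    frame[1][robot] = 1.0
    frame[2][goal] = 1.0
    return frame

def predict_next(robot, goal):
    """Stand-in for a learned next-frame predictor: greedily step one
    cell toward the goal, refusing moves into obstacles."""
    best = robot
    for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        cand = (robot[0] + dr, robot[1] + dc)
        if not (0 <= cand[0] < H and 0 <= cand[1] < W):
            continue
        if obstacles[cand] == 1.0:
            continue
        if (abs(cand[0] - goal[0]) + abs(cand[1] - goal[1])
                < abs(best[0] - goal[0]) + abs(best[1] - goal[1])):
            best = cand
    return best

# "Planning" then amounts to unrolling the predictor into a clip of frames.
robot, goal = (0, 0), (7, 7)
clip = [make_frame(robot, goal)]
for _ in range(20):
    robot = predict_next(robot, goal)
    clip.append(make_frame(robot, goal))
    if robot == goal:
        break
print(len(clip), robot)
```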
Narayanan
We address the problem of finding shortest paths in graphs where some edges have a prior probability of existence, and their existence can be verified during planning with time-consuming operations. Our work is motivated by real-world robot motion planning, where edge existence is often expensive to verify (typically involving time-consuming collision-checking between the robot and world models), but edge existence probabilities are readily available. The goal, then, is to develop an anytime algorithm that can return good solutions quickly by somehow leveraging the existence probabilities, and continue to return better-quality solutions or provide tighter suboptimality bounds with more time. While our motivation is fast and high-quality motion planning for robots, this work presents two fundamental contributions applicable to generic graphs with probabilistic edges. They are: a) an algorithm for efficiently computing all relevant shortest paths in a graph with probabilistic edges, and as a by-product the expected shortest path cost, and b) an anytime algorithm for evaluating (verifying the existence of) edges in a collection of paths, which is optimal in expectation under a chosen distribution of the algorithm interruption time. Finally, we provide a practical approach to integrate a) and b) in the context of robot motion planning and demonstrate significant improvements in success rate and planning time for an 11 degree-of-freedom mobile manipulation planning problem. We also conduct additional evaluations on a 2D grid navigation domain to study our algorithm's behavior.
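A minimal sketch of the plan-then-verify pattern this line of work builds on (not the authors' algorithm): plan optimistically over all edges, spend the expensive verification only on edges of the candidate path, and replan when a verification fails. The toy graph and the `edge_exists` oracle are hypothetical:

```python
import heapq

def dijkstra(adj, s, t):
    """Standard Dijkstra; returns (cost, path) or (inf, None)."""
    dist, prev = {s: 0.0}, {}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            path = [t]
            while path[-1] != s:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return float("inf"), None

def lazy_plan(adj, s, t, edge_exists):
    """Optimistically plan over all edges, then run the expensive check
    (modeled by `edge_exists`) only on edges of the candidate path;
    replan without any edges that turn out not to exist."""
    adj = {u: list(es) for u, es in adj.items()}
    while True:
        cost, path = dijkstra(adj, s, t)
        if path is None:
            return float("inf"), None
        bad = [(u, v) for u, v in zip(path, path[1:]) if not edge_exists(u, v)]
        if not bad:
            return cost, path
        for u, v in bad:
            adj[u] = [(x, w) for x, w in adj[u] if x != v]

# Toy graph: the direct edge a->d is shortest but does not actually exist.
adj = {"a": [("d", 1.0), ("b", 1.0)], "b": [("c", 1.0)], "c": [("d", 1.0)]}
cost, path = lazy_plan(adj, "a", "d", lambda u, v: (u, v) != ("a", "d"))
print(cost, path)
```

Edge existence probabilities would refine this loop by biasing the search away from unlikely edges, so fewer expensive verifications fail.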
An In-Memory Physics Environment as a World Model for Robot Motion Planning
This paper investigates the utilization of a physics simulation environment as the imagination of a robot, where it creates a replica of the detected terrain in a physics simulation environment in its memory, and "imagines" a simulated version of itself in that memory, performing actions and navigation on the terrain. The physics of the environment simulates the movement of robot parts and its interaction with the objects in the environment and the terrain, thus avoiding the need for explicitly programming many calculations.
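A toy version of this "imagination" loop, assuming a 1D height-map replica of the detected terrain and a hypothetical climb limit; the real system would run a full physics engine over the in-memory replica rather than this simplified check:

```python
import numpy as np

# Observed terrain heights along a 1D path (hypothetical sensor output).
terrain = np.array([0.0, 0.0, 0.1, 0.4, 0.9, 0.2, 0.0])

def imagine(terrain, start, step, max_climb=0.35):
    """Replay a candidate gait in the in-memory replica: the imagined
    robot advances `step` cells at a time and fails on climbs beyond
    its capability, without ever moving the real robot."""
    pos = start
    while pos + step < len(terrain):
        nxt = pos + step
        if terrain[nxt] - terrain[pos] > max_climb:
            return pos, False  # the imagined robot gets stuck here
        pos = nxt
    return pos, True

# Trying gaits in imagination reveals where each would fail on this terrain.
print(imagine(terrain, 0, 1))
print(imagine(terrain, 0, 2))
```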
Modern Robotics, Course 4: Robot Motion Planning and Control (Coursera)
About this course: Do you want to know how robots work? Are you interested in robotics as a career? Are you willing to invest the effort to learn fundamental mathematical modeling techniques that are used in all subfields of robotics? If so, then the "Modern Robotics: Mechanics, Planning, and Control" specialization may be for you. This specialization, consisting of six short courses, is serious preparation for serious students who hope to work in the field of robotics or to undertake advanced study.
Custom Processor Speeds Up Robot Motion Planning by Factor of 1,000
If you've ever seen a live robot manipulation demo, you've almost certainly noticed that the robot probably spends a lot of time looking like it's not doing anything. It's tempting to say that the robot is "thinking" when this happens, and that might even be mostly correct: odds are that you're watching some poor motion-planning algorithm trying to figure out how to get the robot's arm and gripper to do what they're supposed to do without running into anything. This motion planning process is one of the most important skills a robot can have (since it's necessary for robots to "do stuff"), and also one of the most time- and processor-intensive. At the RSS 2016 conference this week, researchers from the Duke Robotics group at Duke University in Durham, N.C., are presenting a paper about "Robot Motion Planning on a Chip," in which they describe how they can speed up motion planning by three orders of magnitude while using 20 times less power. How? Rather than using general-purpose CPUs and GPUs, they instead developed a custom processor that can run collision checking across an entire 3D grid all at once.
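The all-at-once grid check can be mimicked in software with a vectorized reduction. This numpy sketch (random stand-in swept volumes and hypothetical grid sizes) illustrates the data layout the chip exploits, not the custom hardware itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Occupancy grid of the workspace: True = obstacle voxel.
grid = np.zeros((32, 32, 32), dtype=bool)
grid[10:14, 10:14, :] = True  # a box-shaped obstacle

# Swept volumes for a set of candidate motions, precomputed offline as
# voxel masks (here random stand-ins). The chip described in the article
# effectively evaluates all of these against the grid simultaneously.
n_motions = 64
swept = rng.random((n_motions, 32, 32, 32)) < 0.01

# Vectorized collision check: a motion collides if its swept volume
# overlaps any occupied voxel. One reduction instead of per-voxel loops.
collides = np.any(swept & grid, axis=(1, 2, 3))
print(collides.shape, int(collides.sum()))
```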